Annotating Phonetic Component of Chinese Characters Using Constrained Optimization and Pronunciation Distribution

Authors

  • Chia-Hui Chang
  • Shu-Yen Lin
  • Shu-Ying Li
  • Meng-Feng Tsai
  • Shu-Ping Li
  • Hsiang-Mei Liao
  • Chih-Wen Sun
  • Norden E. Huang
Abstract

Generally speaking, Chinese characters are graphic characters that do not allow immediate pronunciation unless they are accompanied by Mandarin phonetic symbols (zhuyin) or another phonetic notation (e.g., a romanization system). In fact, about 80 to 90 percent of Chinese characters are pictophonetic characters, which are composed of a phonetic component and a semantic component. Therefore, even if one has not seen a character before, one can make a logical guess at its pronunciation and meaning from its phonetic and semantic components. To analyze such relations, we start by examining the characteristics of phonetic components. We found two interesting features that can be used to automatically identify the phonetic components of Chinese characters: pronunciation similarity and pronunciation distribution. Experiments show that methods based on these two features achieve high accuracy (90.8% and 98.1% for 9593 pictophonetic characters) in predicting the phonetic components of pictophonetic characters. These methods can save a lot of time and effort during the early stage of annotation.
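
The pronunciation-similarity idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not give the scoring function or the constrained-optimization formulation, so the toy dictionary, the (initial, final) split, the 0.5/0.5 weights, and the function names below are all assumptions made for illustration only.

```python
# Hypothetical toy data: toneless readings split into (initial, final).
# None marks a component with no standalone reading in this toy table.
PRONUNCIATION = {
    "媽": ("m", "a"),
    "女": ("n", "ü"),
    "馬": ("m", "a"),
    "河": ("h", "e"),
    "氵": None,
    "可": ("k", "e"),
}


def similarity(a, b):
    """Assumed score: 0.5 for a matching initial plus 0.5 for a matching final."""
    if a is None or b is None:
        return 0.0
    return 0.5 * (a[0] == b[0]) + 0.5 * (a[1] == b[1])


def guess_phonetic_component(character, components):
    """Return the component whose reading is most similar to the character's."""
    target = PRONUNCIATION[character]
    return max(components, key=lambda c: similarity(target, PRONUNCIATION.get(c)))


print(guess_phonetic_component("媽", ["女", "馬"]))
print(guess_phonetic_component("河", ["氵", "可"]))
```

In this toy setting, 媽 (ma) is matched to 馬 (ma) rather than 女 (nü), and 河 (he) is matched to 可 (ke) because the finals agree even though the initials differ. A partial-credit score of this kind is one simple way to tolerate such sound shifts between a character and its phonetic component.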

Similar resources

Position of phonetic components may influence how written words are processed in the brain: Evidence from Chinese phonetic compound pronunciation.

Previous studies have shown a right-visual-field (RVF)/left-hemisphere (LH) advantage in Chinese phonetic compound pronunciation. Here, we contrast the processing of two phonetic compound types: a dominant structure in which a semantic component appears on the left and a phonetic component on the right (SP characters), and a minority structure with the opposite arrangement (PS characters). We s...

主要漢字形聲字發音規則探勘與視覺化 (Primary Chinese Semantic-Phonetic Compounds Pronunciation Rules Mining and Visualization) [In Chinese]

The demand and the importance of Chinese teaching have increased continuously. In order to assist the Chinese learners in composing Chinese characters and increase their learning efficiency, Chinese components teaching method is adopted. The learners can find the clues to both the pronunciations and the meanings of Chinese characters from Chinese components, and semantic-phonetic compounds and ...

Developing a Chinese L2 speech database of Japanese learners with narrow-phonetic labels for computer assisted pronunciation training

For the purpose of developing Computer Assisted Pronunciation Training (CAPT) technology with more informative feedbacks, we propose to use a set of narrow-phonetic labels to annotate Chinese L2 speech database of Japanese learners. The labels include basic units of “Initials”, “Finals” for Chinese phonemes and diacritics for erroneous articulation tendencies. Pilot investigations were made on t...

Phonetic Component Ranking and Pronunciation Rules Discovery for Picto-Phonetic Chinese Characters

In recent years, there are a considerable number of new immigrants in Taiwan. Although these people are in the good position to learn Chinese, the advantages are limited to speaking and listening. Recognizing Chinese characters is a tough task since one has to memorize the shape, meaning and pronunciation at the same time. Therefore, the cost of learning a single character is relatively high co...

以聲符部件為主之漢字學習系統設計研究 (The Design of Chinese Character Learning System Based on Phonetic Components) [In Chinese]

An increasing number of people learn Chinese as second language in the world. About 60% of Chinese characters are picto-phonetic compounds which are composed of a phonetic component (PC) and semantic component. Therefore one can make a guess at a character’s pronunciation and meaning from its phonetic and semantic component for a new character. For this reason we propose an order of phonetic co...

Journal title:
  • IJCLCLP

Volume 15

Publication date 2010